
Conversation

@wangyum (Member) commented Mar 1, 2021

What changes were proposed in this pull request?

Push down limit through Window when the partitionSpec of all window functions is empty and the same ordering is used. This is a real case from production:

(screenshot of the production query plan omitted)

This PR supports two cases:

  1. All window functions have the same orderSpec:
    SELECT *, ROW_NUMBER() OVER(ORDER BY a) AS rn, RANK() OVER(ORDER BY a) AS rk FROM t1 LIMIT 5;
    == Optimized Logical Plan ==
    Window [row_number() windowspecdefinition(a#9L ASC NULLS FIRST, specifiedwindowframe(RowFrame, unboundedpreceding$(), currentrow$())) AS rn#4, rank(a#9L) windowspecdefinition(a#9L ASC NULLS FIRST, specifiedwindowframe(RowFrame, unboundedpreceding$(), currentrow$())) AS rk#5], [a#9L ASC NULLS FIRST]
    +- GlobalLimit 5
       +- LocalLimit 5
          +- Sort [a#9L ASC NULLS FIRST], true
             +- Relation default.t1[A#9L,B#10L,C#11L] parquet
  2. There is a window function with a different orderSpec:
    SELECT a, ROW_NUMBER() OVER(ORDER BY a) AS rn, RANK() OVER(ORDER BY b DESC) AS rk FROM t1 LIMIT 5;
    == Optimized Logical Plan ==
    Project [a#9L, rn#4, rk#5]
    +- Window [rank(b#10L) windowspecdefinition(b#10L DESC NULLS LAST, specifiedwindowframe(RowFrame, unboundedpreceding$(), currentrow$())) AS rk#5], [b#10L DESC NULLS LAST]
       +- GlobalLimit 5
          +- LocalLimit 5
             +- Sort [b#10L DESC NULLS LAST], true
                +- Window [row_number() windowspecdefinition(a#9L ASC NULLS FIRST, specifiedwindowframe(RowFrame, unboundedpreceding$(), currentrow$())) AS rn#4], [a#9L ASC NULLS FIRST]
                   +- Project [a#9L, b#10L]
                      +- Relation default.t1[A#9L,B#10L,C#11L] parquet

Why are the changes needed?

Improve query performance.

spark.range(500000000L).selectExpr("id AS a", "id AS b").write.saveAsTable("t1")
spark.sql("SELECT *, ROW_NUMBER() OVER(ORDER BY a) AS rowId FROM t1 LIMIT 5").show
Before this PR vs. after this PR:

(benchmark screenshots omitted)

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Unit test.

github-actions bot added the SQL label Mar 1, 2021
SparkQA commented Mar 1, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40160/

SparkQA commented Mar 1, 2021

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40160/

SparkQA commented Mar 1, 2021

Test build #135579 has finished for PR 31691 at commit d04b678.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Mar 1, 2021

Test build #135590 has finished for PR 31691 at commit 512dd54.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

    WindowSpecDefinition(Nil, orderSpec,
      SpecifiedWindowFrame(RowFrame, UnboundedPreceding, CurrentRow))), _)), _, _, child))
    if child.maxRows.forall(_ > limitVal) =>
  LocalLimit(
Contributor:

Do we still need the LocalLimit here? We already restrict the window expression to be RankLike and RowNumber, so we know the number of rows will not change before & after window, right?

@wangyum (Member, Author) replied Mar 2, 2021:

It should be optimized by EliminateLimits later. Otherwise, the plan cannot be further optimized, like this:

GlobalLimit 2
+- Window [row_number() windowspecdefinition(c#0 DESC NULLS LAST, specifiedwindowframe(RowFrame, unboundedpreceding$(), currentrow$())) AS rn#0], [c#0 DESC NULLS LAST]
   +- GlobalLimit 2
      +- LocalLimit 2
         +- Sort [c#0 DESC NULLS LAST], true
            +- LocalRelation [a#0, b#0, c#0]
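For context, EliminateLimits is the optimizer rule that collapses such stacked limits. A condensed sketch of what it does (a reconstruction for illustration; the real rule in Optimizer.scala handles more cases):

import org.apache.spark.sql.catalyst.expressions.{IntegerLiteral, Least}
import org.apache.spark.sql.catalyst.plans.logical._
import org.apache.spark.sql.catalyst.rules.Rule

object EliminateLimitsSketch extends Rule[LogicalPlan] {
  def apply(plan: LogicalPlan): LogicalPlan = plan transform {
    // Drop a Limit whose child can never produce more rows than the limit.
    case Limit(IntegerLiteral(limit), child) if child.maxRows.exists(_ <= limit) =>
      child
    // Collapse two adjacent limits into one with the smaller value.
    case GlobalLimit(le, GlobalLimit(ne, grandChild)) =>
      GlobalLimit(Least(Seq(ne, le)), grandChild)
    case LocalLimit(le, LocalLimit(ne, grandChild)) =>
      LocalLimit(Least(Seq(ne, le)), grandChild)
  }
}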

    if child.maxRows.forall(_ > limitVal) =>
  LocalLimit(
    limitExpr = limitExpr,
    child = window.copy(child = Limit(limitExpr, Sort(orderSpec, true, child))))
Contributor:

Wondering why we need an extra Sort here? Shouldn't the physical plan rule EnsureRequirements add the sort between the window and the limit?

@wangyum (Member, Author):

Yes, the Sort is needed because we need a global sort.

Member:

If no partitionSpec is specified, I think the planner inserts an exchange to make a single partition. So we don't need a global sort here?

@wangyum (Member, Author):

The current logic is to sort first and then limit:

== Physical Plan ==
AdaptiveSparkPlan isFinalPlan=false
+- Window [row_number() windowspecdefinition(a#10L ASC NULLS FIRST, b#11L ASC NULLS FIRST, specifiedwindowframe(RowFrame, unboundedpreceding$(), currentrow$())) AS rowId#8], [a#10L ASC NULLS FIRST, b#11L ASC NULLS FIRST]
   +- TakeOrderedAndProject(limit=5, orderBy=[a#10L ASC NULLS FIRST,b#11L ASC NULLS FIRST], output=[a#10L,b#11L])
      +- FileScan parquet default.t1[a#10L,b#11L] Batched: true, DataFilters: [], Format: Parquet, Location: InMemoryFileIndex(1 paths)[file:/Users/yumwang/opensource/spark/spark-warehouse/org.apache.spark...., PartitionFilters: [], PushedFilters: [], ReadSchema: struct<a:bigint,b:bigint>

If we remove the Sort, the logic becomes limit first and then sort:

== Physical Plan ==
AdaptiveSparkPlan isFinalPlan=false
+- Window [row_number() windowspecdefinition(a#10L ASC NULLS FIRST, b#11L ASC NULLS FIRST, specifiedwindowframe(RowFrame, unboundedpreceding$(), currentrow$())) AS rowId#8], [a#10L ASC NULLS FIRST, b#11L ASC NULLS FIRST]
   +- Sort [a#10L ASC NULLS FIRST, b#11L ASC NULLS FIRST], false, 0
      +- GlobalLimit 5
         +- Exchange SinglePartition, ENSURE_REQUIREMENTS, [id=#30]
            +- LocalLimit 5
               +- FileScan parquet default.t1[a#10L,b#11L] Batched: true, DataFilters: [], Format: Parquet, Location: InMemoryFileIndex(1 paths)[file:/Users/yumwang/opensource/spark/spark-warehouse/org.apache.spark...., PartitionFilters: [], PushedFilters: [], ReadSchema: struct<a:bigint,b:bigint>

Member:

Ah, I see. Got it, thanks. I think it's better to leave some comments about why we need it here.
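For example, a code comment along these lines could work (illustrative wording only, not the exact comment that was added):

// A global Sort is added on purpose here: the pushed-down Limit must take
// the top-K rows across all partitions, and Sort + Limit lets the planner
// use TakeOrderedAndProject instead of a full sort of the whole input.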

Contributor:

Thanks for the explanation @wangyum, +1 for adding some comments.

SparkQA commented Mar 2, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40198/

SparkQA commented Mar 2, 2021

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40198/

SparkQA commented Mar 2, 2021

Test build #135619 has finished for PR 31691 at commit d1839b9.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@wangyum (Member, Author) commented Mar 3, 2021

cc @cloud-fan


// Adding an extra Limit below WINDOW when there is only one RankLike/RowNumber
// window function and partitionSpec is empty.
case LocalLimit(limitExpr @ IntegerLiteral(limitVal),
Contributor:

Shall we respect TOP_K_SORT_FALLBACK_THRESHOLD here?
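(For reference, a guard like the sketch below could enforce it; the helper name is hypothetical, and conf.topKSortFallbackThreshold reads spark.sql.execution.topKSortFallbackThreshold:)

// Sketch: only push the limit down when it is small enough that the planner
// can turn the pushed Sort + Limit into TakeOrderedAndProject (top-K sort)
// instead of a full global sort.
private def withinTopKThreshold(limit: Int): Boolean =
  limit < conf.topKSortFallbackThreshold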

@cloud-fan (Contributor) commented Mar 3, 2021

I'm trying to understand the before/after data flow.

Before: input -> shuffle to one partition -> local sort and run rank function -> limit
After: input -> global top k (shuffle to one partition) -> run rank function -> limit

The optimization makes sense, but it seems like we can remove the final limit?

More questions: why only allow a single rank function? What's the requirement on the window frame?

@wangyum (Member, Author) commented Mar 3, 2021

After: input -> global top k (shuffle to one partition) -> run rank function -> limit

EliminateLimits will remove the final limit.
Before:

== Optimized Logical Plan ==
GlobalLimit 5, Statistics(sizeInBytes=140.0 B, rowCount=5)
+- LocalLimit 5, Statistics(sizeInBytes=1731.0 B)
   +- Window [row_number() windowspecdefinition(a#10L ASC NULLS FIRST, specifiedwindowframe(RowFrame, unboundedpreceding$(), currentrow$())) AS rowId#8], [a#10L ASC NULLS FIRST], Statistics(sizeInBytes=1731.0 B)
      +- Relation default.t1[a#10L,b#11L] parquet, Statistics(sizeInBytes=1484.0 B)

== Physical Plan ==
AdaptiveSparkPlan isFinalPlan=false
+- CollectLimit 5
   +- Window [row_number() windowspecdefinition(a#10L ASC NULLS FIRST, specifiedwindowframe(RowFrame, unboundedpreceding$(), currentrow$())) AS rowId#8], [a#10L ASC NULLS FIRST]
      +- Sort [a#10L ASC NULLS FIRST], false, 0
         +- Exchange SinglePartition, ENSURE_REQUIREMENTS, [id=#25]
            +- FileScan parquet default.t1[a#10L,b#11L] Batched: true, DataFilters: [], Format: Parquet, PartitionFilters: [], PushedFilters: [], ReadSchema: struct<a:bigint,b:bigint>

After:

== Optimized Logical Plan ==
Window [row_number() windowspecdefinition(a#10L ASC NULLS FIRST, specifiedwindowframe(RowFrame, unboundedpreceding$(), currentrow$())) AS rowId#8], [a#10L ASC NULLS FIRST], Statistics(sizeInBytes=140.0 B)
+- GlobalLimit 5, Statistics(sizeInBytes=120.0 B, rowCount=5)
   +- LocalLimit 5, Statistics(sizeInBytes=1484.0 B)
      +- Sort [a#10L ASC NULLS FIRST], true, Statistics(sizeInBytes=1484.0 B)
         +- Relation default.t1[a#10L,b#11L] parquet, Statistics(sizeInBytes=1484.0 B)

== Physical Plan ==
AdaptiveSparkPlan isFinalPlan=false
+- Window [row_number() windowspecdefinition(a#10L ASC NULLS FIRST, specifiedwindowframe(RowFrame, unboundedpreceding$(), currentrow$())) AS rowId#8], [a#10L ASC NULLS FIRST]
   +- TakeOrderedAndProject(limit=5, orderBy=[a#10L ASC NULLS FIRST], output=[a#10L,b#11L])
      +- FileScan parquet default.t1[a#10L,b#11L] Batched: true, DataFilters: [], Format: Parquet, PartitionFilters: [], PushedFilters: [], ReadSchema: struct<a:bigint,b:bigint>

@cloud-fan (Contributor):

If we know the final limit will always be removed, why do we add it in the first place?

@wangyum (Member, Author) commented Mar 3, 2021

If we know the final limit will always be removed, why do we add it in the first place?

Fixed by b328375. This needs a small change to EliminateLimits:
(screenshot of the EliminateLimits diff omitted)

SparkQA commented Mar 3, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40287/

SparkQA commented Mar 3, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40287/

SparkQA commented Mar 3, 2021

Test build #135705 has finished for PR 31691 at commit b328375.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

// Adding an extra Limit below WINDOW when there is only one RankLike/RowNumber
// window function and partitionSpec is empty.
case LocalLimit(limitExpr @ IntegerLiteral(limit),
    window @ Window(Seq(Alias(WindowExpression(_: RankLike | _: RowNumber,
Contributor:

Why only allow one rank?

@wangyum (Member, Author) replied Mar 4, 2021:

OK. Added support for multiple window functions when the partitionSpec of all window functions is empty and the same ordering is used. For example:

val numRows = 10
spark.range(numRows).selectExpr("IF (id % 2 = 0, null, id) AS a", s"${numRows} - id AS b", "id AS c").write.saveAsTable("t1")
spark.sql("SELECT *, ROW_NUMBER() OVER(ORDER BY a) AS rn, DENSE_RANK() OVER(ORDER BY a) AS rk FROM t1 LIMIT 5").explain("cost")

Before:

GlobalLimit 5, Statistics(sizeInBytes=200.0 B, rowCount=5)
+- LocalLimit 5, Statistics(sizeInBytes=2.4 KiB)
   +- Window [row_number() windowspecdefinition(a#16L ASC NULLS FIRST, specifiedwindowframe(RowFrame, unboundedpreceding$(), currentrow$())) AS rn#11, dense_rank(a#16L) windowspecdefinition(a#16L ASC NULLS FIRST, specifiedwindowframe(RowFrame, unboundedpreceding$(), currentrow$())) AS rk#12], [a#16L ASC NULLS FIRST], Statistics(sizeInBytes=2.4 KiB)
      +- Relation default.t1[a#16L,b#17L,c#18L] parquet, Statistics(sizeInBytes=1994.0 B)

After:

Window [row_number() windowspecdefinition(a#16L ASC NULLS FIRST, specifiedwindowframe(RowFrame, unboundedpreceding$(), currentrow$())) AS rn#11, dense_rank(a#16L) windowspecdefinition(a#16L ASC NULLS FIRST, specifiedwindowframe(RowFrame, unboundedpreceding$(), currentrow$())) AS rk#12], [a#16L ASC NULLS FIRST], Statistics(sizeInBytes=200.0 B)
+- GlobalLimit 5, Statistics(sizeInBytes=160.0 B, rowCount=5)
   +- LocalLimit 5, Statistics(sizeInBytes=1994.0 B)
      +- Sort [a#16L ASC NULLS FIRST], true, Statistics(sizeInBytes=1994.0 B)
         +- Relation default.t1[a#16L,b#17L,c#18L] parquet, Statistics(sizeInBytes=1994.0 B)

case LocalLimit(limitExpr @ IntegerLiteral(limit),
    window @ Window(Seq(Alias(WindowExpression(_: RankLike | _: RowNumber,
      WindowSpecDefinition(Nil, orderSpec,
      SpecifiedWindowFrame(RowFrame, UnboundedPreceding, CurrentRow))), _)), _, _, child))
Contributor:

Why must the frame be UnboundedPreceding, CurrentRow?

@wangyum (Member, Author):

Removed this check.

SparkQA commented Mar 4, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40328/

SparkQA commented Mar 4, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40328/

SparkQA commented Mar 4, 2021

Test build #135746 has finished for PR 31691 at commit eb081c1.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

wangyum marked this pull request as draft March 5, 2021 02:10
SparkQA commented Mar 8, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40434/

SparkQA commented Mar 8, 2021

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40434/

SparkQA commented Mar 8, 2021

Test build #135852 has finished for PR 31691 at commit ee3a782.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

private def isSupportPushdownThroughWindow(
    windowExpressions: Seq[NamedExpression]): Boolean = windowExpressions.forall {
  case Alias(WindowExpression(_: RankLike | _: RowNumberLike,
      WindowSpecDefinition(Nil, _, _)), _) => true
Contributor:

Does it really work with any kind of window frames? Can we add some comments to explain it?

@wangyum (Member, Author):

The window frame of RankLike and RowNumberLike is UNBOUNDED PRECEDING to CURRENT ROW.

override val frame: WindowFrame = SpecifiedWindowFrame(RowFrame, UnboundedPreceding, CurrentRow)

SparkQA commented Mar 10, 2021

Test build #135938 has finished for PR 31691 at commit 3ec12ad.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun (Member) left a comment:

Hi @wangyum, @cloud-fan, @maropu, @c21,

For safety, if you don't mind, can we have an independent rule, LimitPushDownThroughWindow?

cc @gatorsmile

@wangyum (Member, Author) commented Mar 12, 2021

+1 for adding an independent rule. It seems we can push down more cases, for example:

  1. Window function with PARTITION BY:
     SELECT *, ROW_NUMBER() OVER(PARTITION BY a ORDER BY b) AS rn FROM t LIMIT 10 =>
     SELECT *, ROW_NUMBER() OVER(PARTITION BY a ORDER BY b) AS rn FROM (SELECT * FROM t ORDER BY a, b LIMIT 10) tmp
  2. Window function with predicate:
     SELECT * FROM (SELECT *, ROW_NUMBER() OVER(PARTITION BY a ORDER BY b) AS rn FROM t) tmp WHERE rn < 100 LIMIT 10 =>
     SELECT *, ROW_NUMBER() OVER(PARTITION BY a ORDER BY b) AS rn FROM (SELECT * FROM t ORDER BY a, b LIMIT 10) tmp

@dongjoon-hyun (Member):

Thank you, @wangyum!

@c21 (Contributor) commented Mar 12, 2021

@dongjoon-hyun if the point is to be able to safely exclude the rule through a config if we find a bug after the next release, I am +1 for adding it as a separate rule.

btw I think it'd be good if we could rename the existing rule as well. Having two rules, LimitPushDown and LimitPushDownThroughWindow, is kind of confusing to me. Maybe we can have LimitPushDownThroughUnionAndJoin and LimitPushDownThroughWindow. I feel the same about RemoveNoopOperators and RemoveNoopUnion.

@dongjoon-hyun (Member):

if the point is to safely exclude rule through config if we find bug after next release, I am +1 for adding as a separate rule.

Yes, definitely, that was my point.

@dongjoon-hyun (Member):

BTW, for the renaming proposal, I'm not sure.

 * Pushes down [[LocalLimit]] beneath WINDOW.
 */
object LimitPushDownThroughWindow extends Rule[LogicalPlan] {
  // The window frame of RankLike and RowNumberLike is UNBOUNDED PRECEDING to CURRENT ROW.
Contributor:

is -> can only be

Contributor:

It's probably better to add an assert below to prove this comment.
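Such an assert might look like the following sketch (illustrative only; names and placement may differ from the actual change):

case Alias(WindowExpression(_: RankLike | _: RowNumberLike,
    WindowSpecDefinition(Nil, _, frame)), _) =>
  // Sanity-check the comment above: rank-like functions always use the
  // ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW frame.
  assert(frame == SpecifiedWindowFrame(RowFrame, UnboundedPreceding, CurrentRow))
  true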

val originalQuery = testRelation
  .select(a, b, c,
    windowExpr(RowNumber(), windowSpec(a :: Nil, c.desc :: Nil, windowFrame)).as("rn"))
  .limit(20)
Member:

To test partitionSpec is not empty independently, we need .limit(2), don't we?

@wangyum (Member, Author):

Good catch.

}

/**
 * Pushes down [[LocalLimit]] beneath WINDOW.
Member:

Shall we itemize the functionality and limitations here? LimitPushDownThroughWindow has limited functionality, and it's difficult to track by reading the code. It would be helpful when we add a new feature and maintain this optimizer rule.

@wangyum (Member, Author):

Yes, I plan to explain it with SQL:

/**
 * Pushes down [[LocalLimit]] beneath WINDOW. This rule optimizes the following case:
 * {{{
 *   SELECT *, ROW_NUMBER() OVER(ORDER BY a) AS rn FROM Tab1 LIMIT 5 ==>
 *   SELECT *, ROW_NUMBER() OVER(ORDER BY a) AS rn FROM (SELECT * FROM Tab1 ORDER BY a LIMIT 5) t
 * }}}
 */
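Assembling the hunks quoted in this review, the rule roughly takes the following shape (a reconstructed sketch, so details may differ from the merged code; per the PR description, the merged version also handles a Project between the LocalLimit and the Window):

import org.apache.spark.sql.catalyst.expressions._
import org.apache.spark.sql.catalyst.plans.logical._
import org.apache.spark.sql.catalyst.rules.Rule

object LimitPushDownThroughWindow extends Rule[LogicalPlan] {
  // The window frame of RankLike and RowNumberLike can only be UNBOUNDED
  // PRECEDING to CURRENT ROW, so a top-K sort below the Window cannot
  // change their results.
  private def supportsPushdownThroughWindow(
      windowExpressions: Seq[NamedExpression]): Boolean = windowExpressions.forall {
    case Alias(WindowExpression(_: RankLike | _: RowNumberLike,
        WindowSpecDefinition(Nil, _,
        SpecifiedWindowFrame(RowFrame, UnboundedPreceding, CurrentRow))), _) => true
    case _ => false
  }

  def apply(plan: LogicalPlan): LogicalPlan = plan transform {
    // Push an extra Limit (with a global Sort) below the Window when the
    // partitionSpec of all window functions is empty.
    case LocalLimit(limitExpr @ IntegerLiteral(limit),
        window @ Window(windowExpressions, Nil, orderSpec, child))
        if supportsPushdownThroughWindow(windowExpressions) &&
          child.maxRows.forall(_ > limit) && limit < conf.topKSortFallbackThreshold =>
      window.copy(child = Limit(limitExpr, Sort(orderSpec, true, child)))
  }
}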

private def supportsPushdownThroughWindow(
    windowExpressions: Seq[NamedExpression]): Boolean = windowExpressions.forall {
  case Alias(WindowExpression(_: RankLike | _: RowNumberLike,
      WindowSpecDefinition(Nil, _,
Member:

If you don't mind, can we merge lines 636 and 637?

case Alias(WindowExpression(_: RankLike | _: RowNumberLike, WindowSpecDefinition(Nil, _,
    SpecifiedWindowFrame(RowFrame, UnboundedPreceding, CurrentRow))), _) => true

@dongjoon-hyun (Member) left a comment:

In addition, please put this new optimizer rule in a new file, @wangyum.

SparkQA commented Mar 16, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40680/

SparkQA commented Mar 16, 2021

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40680/

@cloud-fan (Contributor):

The Scala 2.13 failure is unrelated. Thanks, merging to master!
